Building Machines That Learn and Think Like People
Recent progress in artificial intelligence (AI) has renewed interest in
building systems that learn and think like people. Many advances have come from
using deep neural networks trained end-to-end in tasks such as object
recognition, video games, and board games, achieving performance that equals or
even beats humans in some respects. Despite their biological inspiration and
performance achievements, these systems differ from human intelligence in
crucial ways. We review progress in cognitive science suggesting that truly
human-like learning and thinking machines will have to reach beyond current
engineering trends in both what they learn, and how they learn it.
Specifically, we argue that these machines should (a) build causal models of
the world that support explanation and understanding, rather than merely
solving pattern recognition problems; (b) ground learning in intuitive theories
of physics and psychology, to support and enrich the knowledge that is learned;
and (c) harness compositionality and learning-to-learn to rapidly acquire and
generalize knowledge to new tasks and situations. We suggest concrete
challenges and promising routes towards these goals that can combine the
strengths of recent neural network advances with more structured cognitive
models.
Comment: In press at Behavioral and Brain Sciences. Open call for commentary
proposals (until Nov. 22, 2016).
https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar
Perceptual multistability as Markov Chain Monte Carlo inference
While many perceptual and cognitive phenomena are well described in terms of Bayesian inference, the necessary computations are intractable at the scale of real-world tasks, and it remains unclear how the human mind approximates Bayesian computations algorithmically. We explore the proposal that for some tasks, humans use a form of Markov Chain Monte Carlo (MCMC) to approximate the posterior distribution over hidden variables. As a case study, we show how several phenomena of perceptual multistability can be explained as MCMC inference in simple graphical models for low-level vision.
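To make the idea concrete, here is a minimal sketch of a Metropolis sampler on a bimodal posterior, where each mode stands in for one interpretation of an ambiguous stimulus. It is not the model from the paper: the one-dimensional mixture target, the function names, and the values of `mu`, `sigma`, and `step_sd` are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(x, mu=1.5, sigma=0.6):
    """Unnormalized log density of an equal mixture of Gaussians at -mu and +mu.

    The two modes are a hypothetical stand-in for two rival percepts.
    """
    a = -0.5 * ((x - mu) / sigma) ** 2
    b = -0.5 * ((x + mu) / sigma) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(n_steps, step_sd=0.8, x0=1.5, seed=0):
    """Random-walk Metropolis sampler; returns the list of visited states."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_sd)
        log_ratio = log_posterior(proposal) - log_posterior(x)
        # Accept uphill moves always, downhill moves with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


def count_switches(samples):
    """Count sign changes, i.e. transitions between the two assumed percepts."""
    signs = [s >= 0 for s in samples]
    return sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)


if __name__ == "__main__":
    chain = metropolis(20000)
    print("switches between modes:", count_switches(chain))
```

In this toy setting the chain tends to dwell in one mode for a while and then cross to the other, which is the qualitative signature that the abstract associates with multistable perception.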
Nephrogenic Syndrome of Inappropriate Antidiuresis
Mutations in the vasopressin V2 receptor gene are responsible for two human tubular disorders: X-linked congenital nephrogenic diabetes insipidus, due to a loss of function of the mutant V2 receptor, and the nephrogenic syndrome of inappropriate antidiuresis, due to constitutive activation of the mutant V2 receptor. The latter, a recently described disease, may be diagnosed at any age from infancy to adulthood, as some carriers remain asymptomatic for many years. Symptomatic children, however, typically present with clinical and biological features suggesting inappropriate antidiuretic hormone secretion, with severe hyponatremia and high urine osmolality but a low plasma arginine vasopressin level. To date, only two missense mutations in the vasopressin V2 receptor gene have been found in the reported patients. The pathophysiology of the disease requires fuller elucidation, as the phenotypic variability observed in patients bearing the same mutations remains unexplained. Treatment is mainly preventive, based on fluid restriction, but urea may also be proposed.
Fragment Grammars: Exploring Computation and Reuse in Language
Language relies on a division of labor between stored units and structure-building operations that combine the stored units into larger structures. This division of labor leads to a tradeoff: more structure-building means less need to store, while more storage means less need to compute structure. We develop a hierarchical Bayesian model called fragment grammar to explore the optimum balance between structure-building and reuse. The model is developed in the context of stochastic functional programming (SFP), in particular using a probabilistic variant of Lisp known as the Church programming language (Goodman, Mansinghka, Roy, Bonawitz, & Tenenbaum, 2008). We show how to formalize several probabilistic models of language structure using Church, and how fragment grammar generalizes one of them, adaptor grammars (Johnson, Griffiths, & Goldwater, 2007). We conclude with experimental data from adults and preliminary evaluations of the model on natural language corpus data.
ExplainIt! -- A declarative root-cause analysis engine for time series data (extended version)
We present ExplainIt!, a declarative, unsupervised root-cause analysis engine
that uses time series monitoring data from large complex systems such as data
centres. ExplainIt! empowers operators to succinctly specify a large number of
causal hypotheses to search for causes of interesting events. ExplainIt! then
ranks these hypotheses, reducing the number of causal dependencies from
hundreds of thousands to a handful for human understanding. We show how a
declarative language, such as SQL, can be effective for
enumerating hypotheses that probe the structure of an unknown probabilistic
graphical causal model of the underlying system. Our thesis is that databases
are in a unique position to enable users to rapidly explore the possible causal
mechanisms in data collected from diverse sources. We empirically demonstrate
how ExplainIt! has helped us resolve over 30 performance issues in a commercial
product since late 2014, of which we discuss a few cases in detail.
Comment: SIGMOD Industry Track 201
Modeling human intuitions about liquid flow with particle-based simulation
Humans can easily describe, imagine, and, crucially, predict a wide variety
of behaviors of liquids--splashing, squirting, gushing, sloshing, soaking,
dripping, draining, trickling, pooling, and pouring--despite tremendous
variability in their material and dynamical properties. Here we propose and
test a computational model of how people perceive and predict these liquid
dynamics, based on coarse approximate simulations of fluids as collections of
interacting particles. Our model is analogous to a "game engine in the head",
drawing on techniques for interactive simulations (as in video games) that
optimize for efficiency and natural appearance rather than physical accuracy.
In two behavioral experiments, we found that the model accurately captured
people's predictions about how liquids flow among complex solid obstacles, and
was significantly better than two alternatives based on simple heuristics and
deep neural networks. Our model was also able to explain how people's
predictions varied as a function of the liquids' properties (e.g., viscosity
and stickiness). Together, the model and empirical results extend the recent
proposal that human physical scene understanding for the dynamics of rigid,
solid objects can be supported by approximate probabilistic simulation, to the
more complex and unexplored domain of fluid dynamics.
Comment: Under review at PLOS Computational Biolog
Probing the compositionality of intuitive functions
How do people learn about complex functional structure? Taking inspiration from other areas of cognitive science, we propose that this is accomplished by harnessing compositionality: complex structure is decomposed into simpler building blocks. We formalize this idea within the framework of Bayesian regression using a grammar over Gaussian process kernels. We show that participants prefer compositional over non-compositional function extrapolations, that samples from the human prior over functions are best described by a compositional model, and that people perceive compositional functions as more predictable than their non-compositional but otherwise similar counterparts. We argue that the compositional nature of intuitive functions is consistent with broad principles of human cognition.
This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
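As an illustration of what a grammar over Gaussian process kernels can look like, the sketch below composes a few base kernels with sum and product rules. The abstract does not list the base kernels or the production rules used in the paper, so the choice of linear, RBF, and periodic kernels, their parameter values, and every function name here are assumptions.

```python
import math


def linear(x, y):
    """Linear kernel: captures straight-line trends."""
    return x * y


def rbf(x, y, length_scale=1.0):
    """Radial basis function kernel: captures smooth local variation."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)


def periodic(x, y, period=1.0, length_scale=1.0):
    """Periodic kernel: captures repeating structure."""
    s = math.sin(math.pi * abs(x - y) / period)
    return math.exp(-2.0 * (s / length_scale) ** 2)


def add(k1, k2):
    """Sum rule: the sum of two valid kernels is a valid kernel."""
    return lambda x, y: k1(x, y) + k2(x, y)


def mul(k1, k2):
    """Product rule: the product of two valid kernels is a valid kernel."""
    return lambda x, y: k1(x, y) * k2(x, y)


def gram_matrix(kernel, xs):
    """Covariance matrix of the kernel evaluated on all pairs of inputs."""
    return [[kernel(a, b) for b in xs] for a in xs]


# A hypothetical compositional kernel: a linear trend plus a locally periodic part.
composite = add(linear, mul(rbf, periodic))

if __name__ == "__main__":
    for row in gram_matrix(composite, [0.0, 0.5, 1.0]):
        print([round(v, 3) for v in row])
```

Because sums and products of valid kernels are themselves valid kernels, the two combinators can be nested to any depth, which is what lets a small set of building blocks describe complex functional structure.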
NeuSE: Neural SE(3)-Equivariant Embedding for Consistent Spatial Understanding with Objects
We present NeuSE, a novel Neural SE(3)-Equivariant Embedding for objects, and
illustrate how it supports object SLAM for consistent spatial understanding
with long-term scene changes. NeuSE is a set of latent object embeddings
created from partial object observations. It serves as a compact point cloud
surrogate for complete object models, encoding full shape information while
transforming SE(3)-equivariantly in tandem with the object in the physical
world. With NeuSE, relative frame transforms can be directly derived from
inferred latent codes. Our proposed SLAM paradigm, using NeuSE for object shape
and pose characterization, can operate independently or in conjunction with
typical SLAM systems. It directly infers SE(3) camera pose constraints that are
compatible with general SLAM pose graph optimization, while also maintaining a
lightweight object-centric map that adapts to real-world changes. Our approach
is evaluated on synthetic and real-world sequences featuring changed objects
and shows improved localization accuracy and change-aware mapping capability,
when working either standalone or jointly with a common SLAM pipeline.
Comment: 15 pages and 12 figures. Accepted to RSS 2023. Project webpage:
https://neuse-slam.github.io/neuse
Robust Change Detection Based on Neural Descriptor Fields
The ability to reason about changes in the environment is crucial for robots
operating over extended periods of time. Agents are expected to capture changes
during operation so that appropriate actions can follow, ensuring a smooth progression
of the working session. However, varying viewing angles and accumulated
localization errors make it easy for robots to falsely detect changes in the
surrounding world due to low observation overlap and drifted object
associations. In this paper, based on the recently proposed category-level
Neural Descriptor Fields (NDFs), we develop an object-level online change
detection approach that is robust to partially overlapping observations and
noisy localization results. Utilizing the shape completion capability and
SE(3)-equivariance of NDFs, we represent objects with compact shape codes
encoding full object shapes from partial observations. The objects are then
organized in a spatial tree structure based on object centers recovered from
NDFs for fast queries of object neighborhoods. By associating objects via shape
code similarity and comparing local object-neighbor spatial layout, our
proposed approach demonstrates robustness to low observation overlap and
localization noise. We conduct experiments on both synthetic and real-world
sequences and achieve improved change detection results compared to multiple
baseline methods. Project webpage: https://yilundu.github.io/ndf_change
Comment: 8 pages, 8 figures, and 2 tables. Accepted to IROS 2022. Project
webpage: https://yilundu.github.io/ndf_chang